Search | BVS Regional Portal

1.

Predicting Dementia Risk for Elderly Community Dwellers in Primary Care Services Using Subgroup-specific Prediction Models.

Hang Kwok, Stephen Wai; Sipka, Christine; Matthews, Aled; Lara, Carol Pontes; Wang, Guanjin; Choi, Kup-Sze.

Annu Int Conf IEEE Eng Med Biol Soc ; 2023: 1-4, 2023 07.

Article in English | MEDLINE | ID: mdl-38083010

ABSTRACT

Early detection of individuals with a high risk of dementia is crucial for prompt intervention and clinical care. This study aims to identify high-risk groups for developing dementia by predicting the outcome of the Mini-Mental State Examination (MMSE), using historical data collected from community-based primary care services. To mitigate the effect of inter-individual variability and enhance the accuracy of the prediction, we implemented a multi-stage method powered by supervised and unsupervised machine learning methods. Firstly, we preprocessed the original data by imputing missing values and using a wrapper-based feature selection algorithm to pick significant features, resulting in ten variables out of 567 being selected for further modeling. Secondly, we optimized hierarchical clustering to partition the unlabeled data into groups by their similarities, and then applied supervised machine learning models to build subgroup-specific prediction models for the identified groups. The results demonstrate that the proposed subgroup-specific prediction models generated from the multi-stage method achieved satisfactory performance in predicting the outcome classes of dementia risk. This study highlights the potential of incorporating unsupervised and supervised learning models to predict high-risk cases of dementia early and facilitate better clinical decision-making.

Subjects

Dementia , Supervised Machine Learning , Humans , Aged , Algorithms , Dementia/diagnosis , Primary Health Care

2.

Coupling Machine Learning Models with Innovative Technology-based Screening Tool for Identifying Psychological Distress among Aboriginal Perinatal Mothers.

Kwok, Stephen Wai Hang; Kotz, Jayne; Reibel, Tracy; Wang, Guanjin; Walker, Roz; Marriott, Rhonda.

Annu Int Conf IEEE Eng Med Biol Soc ; 2023: 1-4, 2023 07.

Article in English | MEDLINE | ID: mdl-38082817

ABSTRACT

Aboriginal perinatal mothers are at a significant risk of experiencing mental health problems, which can have profound negative impacts, despite their overall resilience. This work aimed to build prediction models for identifying high psychological distress among Aboriginal perinatal mothers by coupling machine learning models with an innovative and culturally-safe screening tool. The original dataset of 179 Aboriginal mothers with 337 variables was obtained from twelve perinatal health settings at Perth metropolitan and regional centers in Western Australia between July and September 2022, using a specifically designed web-based rubric for the perinatal mental health assessment. After data preprocessing and feature selection, 23 variables related to emotional manifestations, the problematic partner, worries about daily living, and the need for follow-up wraparound support were identified as significant predictors for the high risk of psychological distress measured by the Kessler 5 plus adaptation. The selected predictors were used to train prediction models, and most of the chosen machine learning models achieved satisfactory results, with Random Forest and Support Vector Machine yielding the highest AUC of over 0.95, accuracy over 0.86, and F1 score above 0.87. This study demonstrates the potential of using machine learning-based models in clinical decision-making to facilitate healthcare and social and emotional well-being for Aboriginal families.

Subjects

Australian Aboriginal and Torres Strait Islander Peoples , Psychological Distress , Female , Pregnancy , Humans , Mothers/psychology , Western Australia , Machine Learning

3.

A deep multi-view imbalanced learning approach for identifying informative COVID-19 tweets from social media.

Long, Kok Kiang; Kwok, Stephen Wai Hang; Kotz, Jayne; Wang, Guanjin.

Comput Biol Med ; 164: 107232, 2023 09.

Article in English | MEDLINE | ID: mdl-37531859

ABSTRACT

Social media platforms such as Twitter are home ground for rapid COVID-19-related information sharing over the Internet, thereby becoming the favorable data resource for many downstream applications. Due to the massive pile of COVID-19 tweets generated every day, it is significant that the machine-learning-supported downstream applications can effectively skip the uninformative tweets and only pick up the informative tweets for their further use. However, existing solutions do not specifically consider the negative effect caused by the imbalanced ratios between informative and uninformative tweets in training data. In particular, most of the existing solutions are dominated by single-view learning, neglecting the rich information from different views to facilitate learning. In this study, a novel deep imbalanced multi-view learning approach called D-SVM-2K is proposed to identify the informative COVID-19 tweets from social media. This approach is built upon the well-known multiview learning method SVM-2K to incorporate different views generated from different feature extraction techniques. To battle against the class imbalance problem and enhance its learning ability, D-SVM-2K stacks multiple SVM-2K base classifiers in a stacked deep structure where its base classifiers can learn from either the original training dataset or the shifted critical regions identified using the well-known k-nearest neighboring algorithm. D-SVM-2K also realises a global and local deep ensemble learning on the multiple views' data. Our empirical experiments on a real-world labeled tweet dataset demonstrate the effectiveness of D-SVM-2K in dealing with the real-world multi-view class imbalance issues.

Subjects

COVID-19 , Social Media , Humans , Algorithms , Machine Learning , Information Dissemination

4.

A Novel AUC Maximization Imbalanced Learning Approach for Predicting Composite Outcomes in COVID-19 Hospitalized Patients.

Wang, Guanjin; Kwok, Stephen Wai Hang; Yousufuddin, Mohammed; Sohel, Ferdous.

IEEE J Biomed Health Inform ; 27(8): 3794-3805, 2023 08.

Article in English | MEDLINE | ID: mdl-37227914

ABSTRACT

The COVID-19 patient data for composite outcome prediction often comes with class imbalance issues, i.e., only a small group of patients develop severe composite events after hospital admission, while the rest do not. An ideal COVID-19 composite outcome prediction model should possess strong imbalanced learning capability. The model also should have fewer tuning hyperparameters to ensure good usability and exhibit potential for fast incremental learning. Towards this goal, this study proposes a novel imbalanced learning approach called Imbalanced maximizing-Area Under the Curve (AUC) Proximal Support Vector Machine (ImAUC-PSVM) by the means of classical PSVM to predict the composite outcomes of hospitalized COVID-19 patients within 30 days of hospitalization. ImAUC-PSVM offers the following merits: (1) it incorporates straightforward AUC maximization into the objective function, resulting in fewer parameters to tune. This makes it suitable for handling imbalanced COVID-19 data with a simplified training process. (2) Theoretical derivations reveal that ImAUC-PSVM has the same analytical solution form as PSVM, thus inheriting the advantages of PSVM for handling incremental COVID-19 cases through fast incremental updating. We built and internally and externally validated our proposed classifier using real COVID-19 patient data obtained from three separate sites of Mayo Clinic in the United States. Additionally, we validated it on public datasets using various performance metrics. Experimental results demonstrate that ImAUC-PSVM outperforms other methods in most cases, showcasing its potential to assist clinicians in triaging COVID-19 patients at an early stage in hospital settings, as well as in other prediction applications.

Subjects

COVID-19 , Humans , Area Under Curve , Machine Learning , Prognosis , Hospitalization

5.

An artificial intelligence approach for predicting death or organ failure after hospitalization for COVID-19: development of a novel risk prediction tool and comparisons with ISARIC-4C, CURB-65, qSOFA, and MEWS scoring systems.

Kwok, Stephen Wai Hang; Wang, Guanjin; Sohel, Ferdous; Kashani, Kianoush B; Zhu, Ye; Wang, Zhen; Antpack, Eduardo; Khandelwal, Kanika; Pagali, Sandeep R; Nanda, Sanjeev; Abdalrhim, Ahmed D; Sharma, Umesh M; Bhagra, Sumit; Dugani, Sagar; Takahashi, Paul Y; Murad, Mohammad H; Yousufuddin, Mohammed.

Respir Res ; 24(1): 79, 2023 Mar 13.

Article in English | MEDLINE | ID: mdl-36915107

ABSTRACT

BACKGROUND: We applied machine learning (ML) algorithms to generate a risk prediction tool [Collaboration for Risk Evaluation in COVID-19 (CORE-COVID-19)] for predicting the composite of 30-day endotracheal intubation, intravenous administration of vasopressors, or death after COVID-19 hospitalization and compared it with the existing risk scores. METHODS: This is a retrospective study of adults hospitalized with COVID-19 from March 2020 to February 2021. Patients, each with 92 variables, and one composite outcome underwent feature selection process to identify the most predictive variables. Selected variables were modeled to build four ML algorithms (artificial neural network, support vector machine, gradient boosting machine, and Logistic regression) and an ensemble model to generate a CORE-COVID-19 model to predict the composite outcome and compared with existing risk prediction scores. The net benefit for clinical use of each model was assessed by decision curve analysis. RESULTS: Of 1796 patients, 278 (15%) patients reached primary outcome. Six most predictive features were identified. Four ML algorithms achieved comparable discrimination (P > 0.827) with c-statistics ranged 0.849-0.856, calibration slopes 0.911-1.173, and Hosmer-Lemeshow P > 0.141 in validation dataset. These 6-variable fitted CORE-COVID-19 model revealed a c-statistic of 0.880, which was significantly (P < 0.04) higher than ISARIC-4C (0.751), CURB-65 (0.735), qSOFA (0.676), and MEWS (0.674) for outcome prediction. The net benefit of the CORE-COVID-19 model was greater than that of the existing risk scores. CONCLUSION: The CORE-COVID-19 model accurately assigned 88% of patients who potentially progressed to 30-day composite events and revealed improved performance over existing risk scores, indicating its potential utility in clinical practice.

Subjects

COVID-19 , Adult , Humans , COVID-19/diagnosis , Retrospective Studies , Artificial Intelligence , Organ Dysfunction Scores , Hospitalization
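
The abstract of entry 5 compares models by their c-statistic. As a minimal sketch of what that number measures (not the authors' code; the function name and the toy scores are assumptions made for illustration), the c-statistic for a binary outcome can be computed as the share of event/non-event pairs in which the event case receives the higher predicted risk, with ties counted as half:

```python
def c_statistic(scores, labels):
    """Concordance (c-statistic / AUC) for a binary outcome.

    scores: predicted risks, higher means more likely to have the event.
    labels: 1 for patients with the event, 0 for patients without it.
    """
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event case ranked above non-event case
            elif e == n:
                concordant += 0.5   # ties count as half a concordant pair
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data, not taken from the study.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_c = c_statistic(toy_scores, toy_labels)
```

The pairwise loop is quadratic in the number of patients; it is written this way for readability, and a rank-based formula would be the usual choice for large cohorts.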

6.

Seasonal and Year-Round Distributions of Bactrocera dorsalis (Hendel) and Its Risk to Temperate Fruits under Climate Change.

Dong, Zhaoke; He, Yitong; Ren, Yonglin; Wang, Guanjin; Chu, Dong.

Insects ; 13(6)2022 Jun 16.

Article in English | MEDLINE | ID: mdl-35735887

ABSTRACT

Bactrocera dorsalis (Hendel) is an important pest to fruits and vegetables. It can damage more than 300 plant species. The distribution of B. dorsalis has been expanding owing to international trade and other human activities. B. dorsalis occurrence is strongly related to suitable overwintering conditions and distribution areas, but it is unclear where these seasonal and year-round suitable areas are. We used maximum entropy (MaxEnt) to predict the potential seasonal and year-round distribution areas of B. dorsalis. We also projected suitable habitat areas in 2040 and 2060 under global warming scenarios, such as SSP126 and SSP585. These models achieved AUC values of 0.860 and 0.956 for the seasonal and year-round scenarios, respectively, indicating their good prediction capabilities. The precipitation of the wettest month (Bio13) and the mean diurnal temperature range (Bio2) contributed 83.9% to the seasonal distribution prediction model. Bio2 and the minimum temperature of the coldest month (Bio6) provided important information related to the year-round distribution prediction. In future scenarios, the suitable area of B. dorsalis will increase and the range will expand northward. Four important temperate fruits, namely, apples, peaches, pears, and oranges, will be seriously threatened. The information from this study provides a useful reference for implementing improved population management strategies for B. dorsalis.

7.

A Deep-Ensemble-Level-Based Interpretable Takagi-Sugeno-Kang Fuzzy Classifier for Imbalanced Data.

Wang, Guanjin; Zhou, Ta; Choi, Kup-Sze; Lu, Jie.

IEEE Trans Cybern ; 52(5): 3805-3818, 2022 May.

Article in English | MEDLINE | ID: mdl-32946410

ABSTRACT

Existing research reveals that the misclassification rate for imbalanced data depends heavily on the problematic areas due to the existence of small disjoints, class overlap, borderline, and rare data samples. In this study, by stacking zero-order Takagi-Sugeno-Kang (TSK) fuzzy subclassifiers on the minority class and its problematic areas in the deep ensemble, a novel deep-ensemble-level-based TSK fuzzy classifier (IDE-TSK-FC) for imbalanced data classification tasks is presented to achieve both promising classification performance and high interpretability of zero-order TSK fuzzy classifiers. Simultaneously, according to the stacked generalization principle, the proposed classifier lifts up oversampling from the data level to the deep ensemble level with a guarantee of enhanced generalization capability for class imbalance learning. In the structure of IDE-TSK-FC, the first interpretable zero-order TSK fuzzy subclassifier is built on the original training dataset. After that, several successive zero-order TSK fuzzy subclassifiers are stacked layer by layer on the newly identified problematic areas from the original training dataset plus the corresponding interpretable predictions obtained by the averaging strategy on all previous layers. IDE-TSK-FC simply takes the classical K -nearest neighboring algorithm at each layer to identify its problematic area that consists of the minority samples and its surrounding K majority neighbors. After randomly neglecting certain input features and randomly selecting the five Gaussian membership functions for all the chosen input features and the augmented feature in the premise of each fuzzy rule, each subclassifier can be quickly obtained by using the least learning machine to determine the consequent part of each fuzzy rule. The experimental results on both the public datasets and a real-world healthcare dataset demonstrate IDE-TSK-FC's superiority in class imbalanced learning.

Subjects

Algorithms , Fuzzy Logic , Learning
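
The abstract of entry 7 states that each layer uses the K-nearest neighboring algorithm to identify a problematic area made up of the minority samples and their surrounding K majority neighbors. The following is only a sketch of one possible reading of that step, assuming Euclidean distance and a per-minority-sample neighbor search; the function name, parameters, and toy data are hypothetical and do not come from the paper:

```python
import math


def problematic_area(X, y, k, minority_label=1):
    """Return sorted indices of all minority samples plus, for each minority
    sample, its k nearest majority-class neighbors (Euclidean distance)."""
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]

    selected = set(minority)
    for i in minority:
        # Rank majority samples by distance to this minority sample.
        ranked = sorted(majority, key=lambda j: math.dist(X[i], X[j]))
        selected.update(ranked[:k])
    return sorted(selected)


# Hypothetical toy data: two minority points near the origin.
toy_X = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 6.0), (0.0, 0.8)]
toy_y = [1, 1, 0, 0, 0, 0]
area_k1 = problematic_area(toy_X, toy_y, k=1)
area_k2 = problematic_area(toy_X, toy_y, k=2)
```

In a stacked arrangement like the one the abstract describes, a later subclassifier would then be trained on the rows selected here rather than on the full training set.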

8.

Deep Cross-Output Knowledge Transfer Using Stacked-Structure Least-Squares Support Vector Machines.

Wang, Guanjin; Choi, Kup-Sze; Teoh, Jeremy Yuen-Chun; Lu, Jie.

IEEE Trans Cybern ; 52(5): 3207-3220, 2022 May.

Article in English | MEDLINE | ID: mdl-32780705

ABSTRACT

This article presents a new deep cross-output knowledge transfer approach based on least-squares support vector machines, called DCOT-LS-SVMs. Its aim is to improve the generalizability of least-squares support vector machines (LS-SVMs) while avoiding the complicated parameter tuning process that occurs in many kernel machines. The proposed approach has two significant characteristics: 1) DCOT-LS-SVMs is inspired by a stacked hierarchical architecture that combines several layer-by-layer LS-SVMs modules. The module that forms the higher layer has additional input features that consider the predictions from all previous modules and 2) cross-output knowledge transfer is used to leverage knowledge from the predictions of the previous module to improve the learning process in the current module. With this approach, the model's parameters, such as a tradeoff parameter C and a kernel width δ, can be randomly assigned to each module in order to greatly simplify the learning process. Moreover, DCOT-LS-SVMs is able to autonomously and quickly decide the extent of the cross-output knowledge transfer between adjacent modules through a fast leave-one-out cross-validation strategy. In addition, we present an imbalanced version of DCOT-LS-SVMs, called IDCOT-LS-SVMs, given that imbalanced datasets are common in real-world scenarios. The effectiveness of the proposed approaches is demonstrated through a comparison with five comparative methods on UCI datasets and with a case study on the diagnosis of prostate cancer.

Subjects

Support Vector Machine , Least-Squares Analysis

9.

Enhancement of prostate cancer diagnosis by machine learning techniques: an algorithm development and validation study.

Chiu, Peter Ka-Fung; Shen, Xiao; Wang, Guanjin; Ho, Cho-Lik; Leung, Chi-Ho; Ng, Chi-Fai; Choi, Kup-Sze; Teoh, Jeremy Yuen-Chun.

Prostate Cancer Prostatic Dis ; 25(4): 672-676, 2022 04.

Article in English | MEDLINE | ID: mdl-34267331

ABSTRACT

BACKGROUND: To investigate the value of machine learning(ML) in enhancing prostate cancer(PCa) diagnosis. METHODS: Consecutive systematic prostate biopsies performed from Jan 2003-June 2017 were used as the training cohort, and prospective biopsies performed from July 2017-November 2019 were used as validation cohort. Men were included if PSA was 0.4-50 ng/mL, and information of digital rectal examination (DRE), Transrectal ultrasound(TRUS) prostate volume, TRUS abnormality were known. Clinically significant PCa(csPCa) was defined as Gleason 3 + 4 or above cancers. Area-under-curve (AUC) of receiver-operating characteristics (ROC) was compared between PSA, PSA density, European Randomized Study of Screening for Prostate Cancer (ERSPC) risk calculator (ERSPC-RC), and various ML techniques using PSA, DRE and TRUS information. ML techniques used included XGBoost, LightGBM, Catboost, Support vector machine (SVM), Logistic regression (LR), and Random Forest (RF), where cost sensitive learning was applied. RESULTS: Training and validation cohorts included 3881 and 778 consecutive men, respectively. RF model performed better than other ML techniques and PSA, PSA density and ERSPC-RC for prediction of PCa or csPCa in the validation cohort. In csPCa prediction, AUC of PSA, PSA density, ERSPC-RC and RF was 0.71, 0.80, 0.83 and 0.88 respectively. At 90-95% sensitivity for csPCa, RF model achieved a negative predictive value (NPV) of 97.5-98.0% and avoided 38.3-52.2% unnecessary biopsies. Decision curve analyses (DCA) showed RF model provided net clinical benefit over PSA, PSA density and ERSPC-RC. CONCLUSION: By using the same clinical parameters, ML techniques performed better than ERSPC-RC or PSA density in csPCa predictions, and could avoid up to 50% unnecessary biopsies.

Subjects

Prostate , Prostatic Neoplasms , Male , Humans , Prostate/diagnostic imaging , Prostate/pathology , Prostatic Neoplasms/pathology , Prostate-Specific Antigen , Prospective Studies , Risk Assessment/methods , Biopsy/methods , Machine Learning , Algorithms
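
The abstract of entry 9 reports sensitivity, negative predictive value (NPV), and the share of unnecessary biopsies avoided at a chosen operating point. As a hedged sketch of how such figures follow from a risk threshold (the function name, the threshold, and the toy data are assumptions, and "unnecessary biopsies avoided" is read here as the fraction of men without clinically significant cancer who fall below the threshold, which may differ from the paper's definition):

```python
def screening_summary(scores, labels, threshold):
    """Summarize a 'biopsy only if score >= threshold' rule.

    labels: 1 = clinically significant cancer on biopsy, 0 = otherwise.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        biopsy = score >= threshold
        if biopsy and label == 1:
            tp += 1
        elif biopsy and label == 0:
            fp += 1
        elif not biopsy and label == 0:
            tn += 1
        else:
            fn += 1

    return {
        # Share of true cancers that the rule still sends to biopsy.
        "sensitivity": tp / (tp + fn),
        # Share of men below the threshold who are truly cancer-free.
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
        # Assumed definition: cancer-free men spared a biopsy.
        "unnecessary_avoided": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy data, not taken from the study.
toy_scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2, 0.15, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
summary = screening_summary(toy_scores, toy_labels, threshold=0.35)
```

Lowering the threshold raises sensitivity and NPV but spares fewer men a biopsy, which is the trade-off behind the 90-95% sensitivity range quoted in the abstract.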

10.

Tweet Topics and Sentiments Relating to COVID-19 Vaccination Among Australian Twitter Users: Machine Learning Analysis.

Kwok, Stephen Wai Hang; Vadde, Sai Kumar; Wang, Guanjin.

J Med Internet Res ; 23(5): e26953, 2021 05 19.

Article in English | MEDLINE | ID: mdl-33886492

ABSTRACT

BACKGROUND: COVID-19 is one of the greatest threats to human beings in terms of health care, economy, and society in recent history. Up to this moment, there have been no signs of remission, and there is no proven effective cure. Vaccination is the primary biomedical preventive measure against the novel coronavirus. However, public bias or sentiments, as reflected on social media, may have a significant impact on the progression toward achieving herd immunity. OBJECTIVE: This study aimed to use machine learning methods to extract topics and sentiments relating to COVID-19 vaccination on Twitter. METHODS: We collected 31,100 English tweets containing COVID-19 vaccine-related keywords between January and October 2020 from Australian Twitter users. Specifically, we analyzed tweets by visualizing high-frequency word clouds and correlations between word tokens. We built a latent Dirichlet allocation (LDA) topic model to identify commonly discussed topics in a large sample of tweets. We also performed sentiment analysis to understand the overall sentiments and emotions related to COVID-19 vaccination in Australia. RESULTS: Our analysis identified 3 LDA topics: (1) attitudes toward COVID-19 and its vaccination, (2) advocating infection control measures against COVID-19, and (3) misconceptions and complaints about COVID-19 control. Nearly two-thirds of the sentiments of all tweets expressed a positive public opinion about the COVID-19 vaccine; around one-third were negative. Among the 8 basic emotions, trust and anticipation were the two prominent positive emotions observed in the tweets, while fear was the top negative emotion. CONCLUSIONS: Our findings indicate that some Twitter users in Australia supported infection control measures against COVID-19 and refuted misinformation. However, those who underestimated the risks and severity of COVID-19 may have rationalized their position on COVID-19 vaccination with conspiracy theories. 
We also noticed that the level of positive sentiment among the public may not be sufficient to increase vaccination coverage to a level high enough to achieve vaccination-induced herd immunity. Governments should explore public opinion and sentiments toward COVID-19 and COVID-19 vaccination, and implement an effective vaccination promotion scheme in addition to supporting the development and clinical administration of COVID-19 vaccines.

Subjects

COVID-19 Vaccines/administration & dosage , Machine Learning , Social Media/statistics & numerical data , Vaccination/psychology , Australia , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/psychology , Humans , Public Opinion , SARS-CoV-2/immunology

11.

Using Dual Neural Network Architecture to Detect the Risk of Dementia With Community Health Data: Algorithm Development and Validation Study.

Shen, Xiao; Wang, Guanjin; Kwan, Rick Yiu-Cho; Choi, Kup-Sze.

JMIR Med Inform ; 8(8): e19870, 2020 Aug 31.

Article in English | MEDLINE | ID: mdl-32865498

ABSTRACT

BACKGROUND: Recent studies have revealed lifestyle behavioral risk factors that can be modified to reduce the risk of dementia. As modification of lifestyle takes time, early identification of people with high dementia risk is important for timely intervention and support. As cognitive impairment is a diagnostic criterion of dementia, cognitive assessment tools are used in primary care to screen for clinically unevaluated cases. Among them, Mini-Mental State Examination (MMSE) is a very common instrument. However, MMSE is a questionnaire that is administered when symptoms of memory decline have occurred. Early administration at the asymptomatic stage and repeated measurements would lead to a practice effect that degrades the effectiveness of MMSE when it is used at later stages. OBJECTIVE: The aim of this study was to exploit machine learning techniques to assist health care professionals in detecting high-risk individuals by predicting the results of MMSE using elderly health data collected from community-based primary care services. METHODS: A health data set of 2299 samples was adopted in the study. The input data were divided into two groups of different characteristics (ie, client profile data and health assessment data). The predictive output was the result of two-class classification of the normal and high-risk cases that were defined based on MMSE. A dual neural network (DNN) model was proposed to obtain the latent representations of the two groups of input data separately, which were then concatenated for the two-class classification. Mean and k-nearest neighbor were used separately to tackle missing data, whereas a cost-sensitive learning (CSL) algorithm was proposed to deal with class imbalance. The performance of the DNN was evaluated by comparing it with that of conventional machine learning methods. RESULTS: A total of 16 predictive models were built using the elderly health data set. 
Among them, the proposed DNN with CSL outperformed in the detection of high-risk cases. The area under the receiver operating characteristic curve, average precision, sensitivity, and specificity reached 0.84, 0.88, 0.73, and 0.80, respectively. CONCLUSIONS: The proposed method has the potential to serve as a tool to screen for elderly people with cognitive impairment and predict high-risk cases of dementia at the asymptomatic stage, providing health care professionals with early signals that can prompt suggestions for a follow-up or a detailed diagnosis.
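
The abstract of entry 11 mentions a cost-sensitive learning (CSL) algorithm for class imbalance without giving its details. The sketch below is therefore not that algorithm; it shows one common, generic form of cost-sensitive learning, assuming inverse-frequency class weights applied to a binary log loss. All names and the toy labels are hypothetical:

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rarer classes receive larger misclassification costs."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_log_loss(probs, labels, weights):
    """Binary log loss in which each sample is scaled by its class weight.

    probs: predicted probability of class 1 for each sample.
    """
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        w = weights[y]
        total += -w * (math.log(p) if y == 1 else math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum


# Hypothetical toy data: 8 normal cases and 2 high-risk cases.
toy_labels = [0] * 8 + [1] * 2
toy_weights = balanced_class_weights(toy_labels)
toy_loss = weighted_log_loss([0.5] * 10, toy_labels, toy_weights)
```

With these weights, an error on a high-risk case costs four times as much as an error on a normal case, so a model trained on this loss is pushed toward higher sensitivity for the minority class.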

12.

A Transfer-Based Additive LS-SVM Classifier for Handling Missing Data.

Wang, Guanjin; Lu, Jie; Choi, Kup-Sze; Zhang, Guangquan.

IEEE Trans Cybern ; 50(2): 739-752, 2020 Feb.

Article in English | MEDLINE | ID: mdl-30334775

ABSTRACT

The performance of a classifier might greatly deteriorate due to missing data. Many different techniques to handle this problem have been developed. In this paper, we solve the problem of missing data using a novel transfer learning perspective and show that when an additive least squares support vector machine (LS-SVM) is adopted, model transfer learning can be used to enhance the classification performance on incomplete training datasets. A novel transfer-based additive LS-SVM classifier is accordingly proposed. This method also simultaneously determines the influence of classification errors caused by each incomplete sample using a fast leave-one-out cross validation strategy, as an alternative way to clean the training data to further improve the data quality. The proposed method has been applied to seven public datasets. The experimental results indicate that the proposed method achieves at least comparable, if not better, performance than case deletion, mean imputation, and k -nearest neighbor imputation methods, followed by the standard LS-SVM and support vector machine classifiers. Moreover, a case study on a community healthcare dataset using the proposed method is presented in detail, which particularly highlights the contributions and benefits of the proposed method to this real-world application.

13.

Diagnosis of prostate cancer in a Chinese population by using machine learning methods.

Wang, Guanjin; Teoh, Jeremy Yuen-Chun; Choi, Kup-Sze.

Annu Int Conf IEEE Eng Med Biol Soc ; 2018: 1-4, 2018 Jul.

Article in English | MEDLINE | ID: mdl-30440319

ABSTRACT

An early diagnosis of prostate cancer (PC) is key for the successful treatment. Although invasive prostate biopsies can provide a definitive diagnosis, the number of biopsies should be reduced to avoid side effects and risks especially for the men with the low risk of cancer. Therefore, an accurate model is in need to predict PC with the aim of reducing unnecessary biopsies. In this study, we developed predictive models using four machine learning methods including Support Vector Machine (SVM), Least Squares Support Vector Machine (LS-SVM), Artificial Neural Network (ANN) and Random Forest (RF) to detect PC cases using available prebiopsy information. The models were constructed and evaluated on a cohort of 1625 Chinese men with prostate biopsies from Hong Kong hospital. All the models have the excellent performances in detecting significant PC cases, with ANN achieving the highest accuracy of 0.9527 and the AUC value of 0.9755. RF outperformed the other three methods in classifying benign, significant and insignificant PC cases, with an accuracy of 0.9741 and a F1 score of 0.8290.

Subjects

Machine Learning , Prostatic Neoplasms/diagnosis , Aged , Asian People , Biopsy , Humans , Least-Squares Analysis , Male , Middle Aged , Neural Networks, Computer , Prostatic Neoplasms/pathology , Support Vector Machine

14.

Tackling Missing Data in Community Health Studies Using Additive LS-SVM Classifier.

Wang, Guanjin; Deng, Zhaohong; Choi, Kup-Sze.

IEEE J Biomed Health Inform ; 22(2): 579-587, 2018 03.

Article in English | MEDLINE | ID: mdl-27925597

ABSTRACT

Missing data is a common issue in community health and epidemiological studies. Direct removal of samples with missing data can lead to reduced sample size and information bias, which deteriorates the significance of the results. While data imputation methods are available to deal with missing data, they are limited in performance and could introduce noises into the dataset. Instead of data imputation, a novel method based on additive least square support vector machine (LS-SVM) is proposed in this paper for predictive modeling when the input features of the model contain missing data. The method also determines simultaneously the influence of the features with missing values on the classification accuracy using the fast leave-one-out cross-validation strategy. The performance of the method is evaluated by applying it to predict the quality of life (QOL) of elderly people using health data collected in the community. The dataset involves demographics, socioeconomic status, health history, and the outcomes of health assessments of 444 community-dwelling elderly people, with 5% to 60% of data missing in some of the input features. The QOL is measured using a standard questionnaire of the World Health Organization. Results show that the proposed method outperforms four conventional methods for handling missing data: case deletion, feature deletion, mean imputation, and K-nearest neighbor imputation, with the average QOL prediction accuracy reaching 0.7418. It is potentially a promising technique for tackling missing data in community health research and other applications.

Subjects

Databases, Factual , Medical Informatics/methods , Support Vector Machine , Aged , Aged, 80 and over , Data Interpretation, Statistical , Female , Humans , Least-Squares Analysis , Male , Public Health , Quality of Life
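
The abstract of entry 14 names mean imputation and K-nearest neighbor imputation among the conventional baselines. As a minimal sketch of those two baselines only (not the proposed additive LS-SVM method), assuming missing values are encoded as `None`, that every column has at least one observed value, and that neighbor distance is the root-mean-square difference over features both rows have observed; the function names and toy rows are hypothetical:

```python
import math


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))  # assumes a non-empty column
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def knn_impute(rows, k):
    """Replace each None with the mean of that feature over the k nearest
    rows that have the feature observed."""

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common observed feature to compare on
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical toy data: the second row is missing its second feature.
toy_rows = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [10.0, 100.0]]
mean_filled = mean_impute(toy_rows)
knn_filled = knn_impute(toy_rows, k=2)
```

On the toy rows, mean imputation is pulled upward by the distant fourth row, while the two-neighbor estimate uses only the rows closest on the observed feature; both approaches still write estimated values into the dataset, which is the limitation the abstract raises.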

15.

Seizure Classification From EEG Signals Using Transfer Learning, Semi-Supervised Learning and TSK Fuzzy System.

Jiang, Yizhang; Wu, Dongrui; Deng, Zhaohong; Qian, Pengjiang; Wang, Jun; Wang, Guanjin; Chung, Fu-Lai; Choi, Kup-Sze; Wang, Shitong.

IEEE Trans Neural Syst Rehabil Eng ; 25(12): 2270-2284, 2017 12.

Article in English | MEDLINE | ID: mdl-28880184

ABSTRACT

Recognition of epileptic seizures from offline EEG signals is very important in clinical diagnosis of epilepsy. Compared with manual labeling of EEG signals by doctors, machine learning approaches can be faster and more consistent. However, the classification accuracy is usually not satisfactory for two main reasons: the distributions of the data used for training and testing may be different, and the amount of training data may not be enough. In addition, most machine learning approaches generate black-box models that are difficult to interpret. In this paper, we integrate transductive transfer learning, semi-supervised learning and TSK fuzzy system to tackle these three problems. More specifically, we use transfer learning to reduce the discrepancy in data distribution between the training and testing data, employ semi-supervised learning to use the unlabeled testing data to remedy the shortage of training data, and adopt TSK fuzzy system to increase model interpretability. Two learning algorithms are proposed to train the system. Our experimental results show that the proposed approaches can achieve better performance than many state-of-the-art seizure classification algorithms.

Subjects

Electroencephalography/classification , Fuzzy Logic , Seizures/classification , Supervised Machine Learning , Transfer, Psychology , Algorithms , Epilepsy/diagnosis , Humans , Models, Statistical , Pattern Recognition, Automated , Reproducibility of Results , Software , Support Vector Machine

16.

Prediction of mortality after radical cystectomy for bladder cancer by machine learning techniques.

Wang, Guanjin; Lam, Kin-Man; Deng, Zhaohong; Choi, Kup-Sze.

Comput Biol Med ; 63: 124-32, 2015 Aug.

Article in English | MEDLINE | ID: mdl-26073099

ABSTRACT

Bladder cancer is a common cancer in genitourinary malignancy. For muscle invasive bladder cancer, surgical removal of the bladder, i.e. radical cystectomy, is in general the definitive treatment which, unfortunately, carries significant morbidities and mortalities. Accurate prediction of the mortality of radical cystectomy is therefore needed. Statistical methods have conventionally been used for this purpose, despite the complex interactions of high-dimensional medical data. Machine learning has emerged as a promising technique for handling high-dimensional data, with increasing application in clinical decision support, e.g. cancer prediction and prognosis. Its ability to reveal the hidden nonlinear interactions and interpretable rules between dependent and independent variables is favorable for constructing models of effective generalization performance. In this paper, seven machine learning methods are utilized to predict the 5-year mortality of radical cystectomy, including back-propagation neural network (BPN), radial basis function (RBFN), extreme learning machine (ELM), regularized ELM (RELM), support vector machine (SVM), naive Bayes (NB) classifier and k-nearest neighbour (KNN), on a clinicopathological dataset of 117 patients of the urology unit of a hospital in Hong Kong. The experimental results indicate that RELM achieved the highest average prediction accuracy of 0.8 at a fast learning speed. The research findings demonstrate the potential of applying machine learning techniques to support clinical decision making.

Subjects

Cystectomy , Databases, Factual , Models, Biological , Support Vector Machine , Urinary Bladder Neoplasms/mortality , Urinary Bladder Neoplasms/surgery , Aged , Disease-Free Survival , Female , Follow-Up Studies , Humans , Male , Middle Aged , Predictive Value of Tests , Survival Rate
